Digital Imaging — How Libraries, Museums, and Other Image Banks Are
Managing a Digital World 

By Robert Lucas 
Encinitas Center 

The dazzling success of the Web is rapidly creating a digital image-filled culture. There are now literally millions 
of historic and artistic images out there on the Internet waiting for people to access and use. Another feature in the 
evolving ecology of the digital world is the rapidly growing population of content providers anxious to put these 
images to use. 

On the other side of the digital equation are the image providers. For them, the most important thing is being found. 
They have the digital representations of images they want you to see. But between those millions of images and the 
would-be end users the connection can be anything but smooth. 

Images in Search of Viewers 

Just try, for instance, finding even a well-known image. Using the specific search terms, "Madonna and Chair," 
"Raphael," and "painting," my favorite search engine immediately yielded an art course with no image links; offers 
to purchase music CDs of the pop icon Madonna, and an explanation of why a home for troubled children chose the 
image in question for their Home page. Eventually, I did find a good-sized image of the Renaissance classic and 
some good information about it. But it took forever. 

Different search engines yield different results for many reasons, but so far none is clearly better than the others.
Whatever your reason for searching for an image, you probably want to get the best view and the most useful
information about it. Maybe you want a copy of it for your own use.

To try to find the smoothest path between image and user, we took a look at four commercial and educational
"image banks" and the ways they approach the challenges of digitizing and describing visual content. It's new
territory, and maps are sketchy so far. Digital libraries are just now moving toward a set of standards about such
questions as the right size of the image files and, more crucially, the best types of search criteria or metadata that
will be used to describe images.

Five years ago, if Apple Computer or its ad agency had been interested in finding an image of Albert Einstein to 
use in a new ad campaign, it probably would have called on the Bettmann Collection, the largest commercial, 
historical and news photo collection in the world. In those pre-digital days, a researcher at Bettmann would gather 
information about the desired image, go through several card catalogs, record the image codes, then go through 
rooms of file cabinets, holding images from prehistoric cave paintings to modern-day photojournalism. After 
gathering the appropriate set of prints, transparencies, or glass negatives from the files, the researcher would 
package them, label them carefully, and then mail, drive, or overnight them to wherever they were needed. 

Corbis Corp. — Putting Images First 

Today, new technologies have given publishers the ability to view, search, purchase, and interact with those same 
images in minutes. Corbis Corp. of Bellevue, Wash., a private company owned by Bill Gates, now owns the Bettmann
Collection and is providing Apple as well as clients like A&E, Discovery, Simon & Schuster, and Reed with the
rights to images they want. Corbis saw the benefits of digital imaging coming, got ready, and is now way out in 
front of the adoption curve. 

Last spring, Corbis purchased the digital rights to the entire Ansel Adams collection of 2,500 photographs, adding 
to the fine arts library that includes works from the National Gallery of London, the Hermitage Museum in Russia, 
and the Kimbell Museum in Fort Worth, Texas. The digital library, now numbering 13 million images, is a
long-term investment for Gates, the chairman of Microsoft Corp.

The past eight years have been spent building the infrastructure of Corbis. The site has assembled the largest
concentration of flatbed scanners in the world, as well as an army of image management and library science
professionals, and images are being scanned, edited, and stored in a large multimedia database. "At one time we
were scanning at a rate of 10,000 images per week," says Jim DeNike, Manager of Corporate Communications.
"After reaching a critical mass of digital images, we have scaled back to a rate of 3,000 images per week."

A client logs on to the Corbis Web site (http://www.corbis.com) and, using the company's proprietary software, can
browse through an index of categories: historical views, world art, entertainment, contemporary life, science and
industry, animals, nature scenes, travel and culture, collections, and special events. Search and retrieval is possible
using text or image-based links tied to the metadata. The search is fast and accurate, and the desired image can be
posted to the client's password-protected area online.

Corbis has attempted other business ventures tied to its huge collection, but developing good content and 
developing content that sells are two very different things. After startup, Corbis's content developers built
multimedia projects around its art collections, starting with A Passion for Art: Renoir, Cezanne, Matisse, and Dr.
Barnes. This and other CD-ROMs, such as Paul Cezanne: Portrait of My World and Leonardo da Vinci, won
multimedia awards and rave critical reviews, but could not compete for retail shelf space with games and business
software. The CD-ROM market could not support these products, and they now are licensed to distribution
companies and museum stores. Corbis has no plans for further excursions into this market.

CD-ROM may be out, but Web sites are still in. Corbis used 7,000 images from its collection to create an
interactive online product, called Trip, which profiles 100 exotic locations around the world. Banner space links
peripatetic viewers to Corbis advertisers and business partners in travel-related products. An online print and poster
shop, which can also be accessed from its home page, provides an outlet for a substantial number of Corbis images,
framed or unframed. Consumers can customize and preview their prints and frames online before they order.

So, while being primarily an image provider for media professionals, Corbis is using the Web to branch out into
various consumer markets, as well as to generate revenue through banner advertising around its compelling images.

Another obsession at Corbis is conservation, and digitization is helping here too. "Whenever you get a collection
that's old, preservation becomes a concern," says DeNike. "The negatives literally break down and decompose. So 
we have an extensive preservation process underway and someone who is dedicated solely to that project at both 
our New York and Seattle offices. One of the benefits of digitization is that it does allow us to eternally preserve 
what we have in analog format." 

Once products are digitally distributed over the Internet, it's easy for content to fall into the hands of unlicensed 
users. How does a company protect its digital visual assets? Corbis and other visual content providers deal with
would-be image thieves through the use of digital watermarks. 

The practice of watermarking documents dates back to the Middle Ages, when Italian paper makers marked their 
unique pieces of paper to prevent others from falsely claiming craftsmanship. Today, watermarks are still used to 
identify stationery and stock for bank checks. Like its analog counterpart, digital watermarking carries information 
about the source along with the content. A pattern of bits is inserted into a digital image, scattered throughout the 
file in such a way that it cannot be identified or manipulated. To view a watermark, you need a special program that 
knows how to extract the watermark data. Corbis uses Digimarc's watermarking technology, which is integrated 
into Photoshop and other graphics programs, allowing it to save images with copyright information embedded. 
Digimarc can then identify Web sites that use those images legally or illegally. The company's MarcSpider crawls 
the Web looking for watermarked images and reports its findings. There's no escaping this electronic arachnid. 
MarcSpider reports back with a thumbnail picture of found images, other image file data, and a hyperlink to their 
location. 
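
To make the idea of a scattered bit pattern concrete, here is a minimal Python sketch of one of the simplest
watermarking schemes: the message is written into the least significant bits of pixels chosen by a keyed random
number generator. It is an illustration only. Digimarc's technology is proprietary and far more robust, and the
function names, the integer key, and the flat list of grayscale pixel values are all assumptions made for this example.

```python
import random


def _positions(n_pixels, n_bits, key):
    # The key seeds the generator, so only a program that knows the key
    # can regenerate the same scattered pixel positions.
    return random.Random(key).sample(range(n_pixels), n_bits)


def embed(pixels, message, key):
    """Return a copy of `pixels` (ints 0-255) carrying `message` in low-order bits."""
    data = message.encode("ascii")
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for this message")
    marked = list(pixels)
    for pos, bit in zip(_positions(len(pixels), len(bits), key), bits):
        marked[pos] = (marked[pos] & 0xFE) | bit  # overwrite the lowest bit only
    return marked


def extract(pixels, n_chars, key):
    """Read back `n_chars` ASCII characters using the same key."""
    bits = [pixels[pos] & 1 for pos in _positions(len(pixels), n_chars * 8, key)]
    chars = []
    for start in range(0, len(bits), 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit
        chars.append(chr(value))
    return "".join(chars)


if __name__ == "__main__":
    image = [128] * 1000                      # stand-in for real pixel data
    notice = "(c) Owner"                      # hypothetical copyright notice
    marked = embed(image, notice, key=1998)
    print(extract(marked, len(notice), key=1998))
```

Each marked pixel differs from the original by at most one brightness level, which is why such a mark is invisible to
the eye, and reading it back requires the same key that chose the positions. A scheme this simple would not survive
cropping or recompression, so it should be read as a teaching aid and not as a description of any commercial product.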

SCRAN — Protecting Owners, Encouraging Users 

Protecting Intellectual Property Rights (IPR) was "the largest obstacle to overcome in making SCRAN a success." 
So says Graham Turnbull, Publishing Manager of The Scottish Cultural Resources Access Network, a Millennium 
Project intended to provide access to Scotland's rich human history and culture. 

"Owners of content have been discouraged from distributing or in some cases even digitizing it for fear of being 
ripped off in some way," says Turnbull. Within the areas which it can influence, the SCRAN project has been 
attempting to simplify the situation. The network has developed a Licensing Model which appears to be working 
well and which others might like to emulate. SCRAN is translating the material culture of Scotland into a
Web-based, networked multimedia databank. Founded by several of Scotland's cultural institutions, the network is a
nonprofit organization funded by the UK National Lottery. 

By the year 2001, SCRAN expects to have digitized more than 100,000 multimedia objects and made them 
available on its networked server (http://www.scran.ac.uk) for educational re-use. The digitizing projects range 
from the Impressionist paintings of the Glasgow Art Galleries, to maps and aerial photographs held by the Royal 
Commission on the Ancient and Historical Monuments of Scotland, to scenes from rural life contributed by the Highland Folk
Museum. 

SCRAN's licensing model protects the IPR holders and their assets, while fulfilling the organization's educational
outreach mission. Licensing fees should also keep SCRAN operating after 2001 when the Lottery's £15 million
organizational grant expires.

Here's how SCRAN works: SCRAN issues grants to cultural organizations, which contractually purchase the 
educational license to the digitized material and allow contributors to develop and process images, text, video,
plans, animations, and virtual reality environments into the SCRAN collection. SCRAN then makes the digital 
assets available on the National Grid for Learning. 

Thumbnail images are freely and openly available to all on the Web. Images of 72 DPI, 256-color quality, along 
with other multimedia resources, are available to licensed educational institutions where teachers and students can 
download material for educational use. As a measure of future-proofing, the actual digitization is done at a far 
higher level than is necessary for these current educational uses, so that they can be upgraded as future 
technological developments demand. The high-quality images are also available for commercial exploitation for a 
fee, which will be passed back to the original copyright holders. 

"Rights of the highest-quality digital image remain with the original owner of the object, and copies may only be 
released under specific circumstances agreed with the rights holder," says Turnbull. "In practice, we expect that 
many institutions will come to regard SCRAN as a sort of clearing agency for commercial reproduction rights, the 
fees being passed back to the IPR holder minus a small handling charge." 
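
The tiers and the fee pass-through described above can be sketched in a few lines of Python. This is only a model of
the arrangement as the article reports it. The user-type labels, the function names, and above all the handling rate
are hypothetical, since SCRAN's actual charge is described only as "small."

```python
# What each kind of user can obtain, as described in the article.
# The dictionary keys are labels invented for this sketch.
RENDITIONS = {
    "public": "thumbnail",
    "licensed_education": "72 DPI, 256-color image",
    "commercial": "high-quality image",
}


def rendition_for(user_type):
    """Return the image quality available to a given kind of user."""
    try:
        return RENDITIONS[user_type]
    except KeyError:
        raise ValueError(f"unknown user type: {user_type!r}") from None


def split_fee(fee_pence, handling_rate_percent):
    """Split a commercial fee into (amount to rights holder, handling charge).

    Amounts are whole pence; the rate is an assumed parameter, not SCRAN's.
    """
    handling = fee_pence * handling_rate_percent // 100
    return fee_pence - handling, handling


if __name__ == "__main__":
    print(rendition_for("public"))
    print(split_fee(10000, 5))   # a 100-pound fee at an assumed 5 percent rate
```

Keeping amounts in whole pence avoids rounding surprises with fractional pounds.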

What is equally important about the SCRAN project is the nature of its metadata and the steps it is taking to
prepare for the future. By the year 2001, the SCRAN Resource Base will contain 100 Multimedia Essays, 100,000 Full
Datasets, and at least 1.5 million Basic Records. 

While the hub of SCRAN is its central resource base, the appropriate technology for delivering electronic 
educational resources for many schools remains the CD-ROM. SCRAN will fund the production of up to 100
publications on CD-ROM, DVD, or other media by numerous other cultural organizations with collections of artifacts.
SCRAN uses the term Multimedia Essay to describe such work.

In a program to make the images more accessible, the basic information about them is recorded by SCRAN in a 
standard called the Dublin Core. The problem is that museums, libraries, galleries, and archives all have had some 
form of documentation or collections management system. But these systems were developed for internal purposes
to assist the smooth running of the organization. Museum records differ in structure and vocabulary from
archive records, and they are intended for staff or specialist researchers, not the public. SCRAN is a major driving
force in the development of a seamless resource access process. A project is underway to gather data about the
information in all the different systems, in compliance with an emerging standard core set of metadata elements.
These elements, known as the Dublin Core, mean that an end user will be able to search across domains.

SCRAN is an image database, but it is also a metadata repository, containing Basic Records that act as pointers to
SCRAN's own digitized resources, to more detailed domain-specific records on other servers, to other Web sites, 
and to visitor destinations in the real world. 
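
A short Python sketch shows why a shared element set makes cross-domain searching possible. The fifteen element
names are those of the Dublin Core element set. Everything else here is invented for illustration: the in-house field
names, the two mapping tables, and the sample records are assumptions, not SCRAN's actual schema or data.

```python
# The fifteen Dublin Core elements.
DUBLIN_CORE = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Hypothetical in-house field names mapped to Dublin Core elements.
MUSEUM_MAP = {"object_name": "title", "maker": "creator", "accession_no": "identifier"}
ARCHIVE_MAP = {"heading": "title", "author": "creator", "ref_code": "identifier"}


def to_dublin_core(record, field_map):
    """Translate one in-house record into a Dublin Core style Basic Record."""
    basic = {}
    for field, value in record.items():
        element = field_map.get(field)
        if element in DUBLIN_CORE:      # fields with no mapping are dropped
            basic[element] = value
    return basic


def search(records, element, term):
    """Case-insensitive search of one element across records from any source."""
    term = term.lower()
    return [r for r in records if term in r.get(element, "").lower()]


if __name__ == "__main__":
    museum = to_dublin_core(
        {"object_name": "Spinning wheel", "maker": "Unknown", "accession_no": "M-001"},
        MUSEUM_MAP,
    )
    archive = to_dublin_core(
        {"heading": "Letter about a spinning wheel", "author": "Unknown", "ref_code": "A-001"},
        ARCHIVE_MAP,
    )
    for hit in search([museum, archive], "title", "spinning wheel"):
        print(hit["identifier"], hit["title"])
```

Once both records use the same element names, one query reaches the museum object and the archive document alike,
and the identifier can serve as the pointer back to the fuller domain-specific record.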

A user accessing the SCRAN database will, in time, have a hot link to any of 1.5 million records of museums and
libraries around the world. 

Interoperability means being able to seamlessly retrieve data from more than one source. Standards like the Dublin
Core are attempts to make interoperability a reality. Roy Tennant, Digital Library Project Manager at UC Berkeley, 
describes the metadata environment as "islands of information, which though rich in their own knowledge ecologies 
are separated from one another by gaps in standards and protocols. For anyone researching and searching for 
images, the process is akin to island hopping from one resource to another, because the resources simply are not 
connected." 

Until a universally accepted standard finally does emerge, libraries can only follow their best hunches in creating 
metadata for their collections. And the standards of today must be able to make the move to future improvements in 
technology and information management. In theory, the metadata about the information being digitized would be as 
comprehensive as possible for the next versions of knowledge structures. For a library, though, the most costly part 
of the digitizing process is staff time. So the goal of interoperability must be balanced with costs.

The UC Berkeley Digital Library (http://sunsite.berkeley.edu/) is developing the technologies for intelligent access
to massive, distributed collections of photographs, satellite images, maps and full-text documents. Already, it has 
digitized collections of thousands of photographs related to California heritage, aerial photos of San Francisco and 
Yosemite National Park, and the Emma Goldman and Jack London collections. Its SunSITE location is an 
incredible resource for finding digital libraries and their collections. 

Tennant says, "There are many ways to catalog digital imagery. And it's only now that standards for digital
imagery are emerging." There are two general approaches to cataloging collections of images. One is
group-level cataloging. Collections can be described by a set of records for logical groupings of items. An example 
of a grouping might be: views of Yosemite taken by a particular photographer on a particular assignment. While 
there is no way to distinguish between the images without viewing them, the user could likely find the single 
catalog record linked to several images. 
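
As a data structure, group-level cataloging amounts to one descriptive record pointing at many image files. The
Python sketch below is a hypothetical illustration. The class name, its fields, and the sample entries are
assumptions, not the Berkeley library's actual record format.

```python
from dataclasses import dataclass, field


@dataclass
class GroupRecord:
    """One catalog record describing a logical grouping of images."""
    title: str
    photographer: str
    image_ids: list = field(default_factory=list)


def find_groups(catalog, term):
    """Return the group records whose title mentions the search term."""
    term = term.lower()
    return [group for group in catalog if term in group.title.lower()]


if __name__ == "__main__":
    catalog = [
        GroupRecord("Views of Yosemite", "Example Photographer",
                    ["img-001", "img-002", "img-003"]),
        GroupRecord("Aerial photos of San Francisco", "Example Photographer",
                    ["img-101"]),
    ]
    for group in find_groups(catalog, "yosemite"):
        # The record says nothing about individual images, so the user
        # still has to open each one to tell them apart.
        print(group.title, group.image_ids)
```

The search cost is low because there is only one record to write and to match, but nothing in it distinguishes
one image in the group from another.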

Another approach would be to prepare an individual, complete record for each item in the collection. To 
accomplish that, a librarian might have to research the name of the Yosemite landmark portrayed. And that would 
be easy, compared to the metadata that should accompany a record. It was once estimated that it would take the 
entire cataloging staff from all Berkeley libraries about 400 years to catalog a collection of 3.5 million images. 
These are the kinds of issues facing any library that is attempting to digitize its collections.

Apart from the content, there is also administrative metadata which should be included in an image's listing,
particularly its size (number of bits, DPI, or thumbnail). Selecting the resolution and the file size at which a file is 
actually saved involves several considerations. The two largest factors for consideration are funds available and 
storage capacity. Given those limitations, the first task is to capture the best possible image. Tennant has a range of 
choices. To archive collections, sometimes 2400 DPI is used. But generally, a reference copy is made at 600 DPI. 
From this copy, an organization can spin off a 72-DPI image for the Web and resize that for a thumbnail. The point, 
however, is that there is no standard as yet. Images are typically saved at anywhere from 300 to 600 or even up to 2400
DPI. 
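
The trade-off between resolution and storage is easy to work out: pixel dimensions are the print's size in inches
multiplied by the DPI, and the uncompressed file size is the pixel count multiplied by the bytes stored per pixel.
The Python sketch below applies that arithmetic to the resolutions mentioned above. The 8 x 10 inch print and the
24-bit color depth (3 bytes per pixel) are assumptions chosen for the example, and real files are usually smaller
once compressed.

```python
def scan_size(width_in, height_in, dpi, bytes_per_pixel=3):
    """Return (pixel width, pixel height, uncompressed size in bytes)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    for label, dpi in (("archive", 2400), ("reference", 600), ("Web", 72)):
        width_px, height_px, size = scan_size(8, 10, dpi)
        print(f"{label:9} {dpi:5} DPI  {width_px} x {height_px} px  {size / 1e6:,.1f} MB")
```

Under these assumptions the 600-DPI reference copy is 4800 x 6000 pixels and about 86 MB uncompressed, the 72-DPI
Web copy is 576 x 720 pixels and about 1.2 MB, and a 2400-DPI archival scan runs to roughly 1.4 GB. Because size
grows with the square of the resolution, going from 600 to 2400 DPI multiplies storage by sixteen, which is why
funds and storage capacity dominate the decision.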

Digitizing Materials for a Cyber Gallery Space — The Chicago Historical Society 

Some institutions don't have the resources to digitize their collections, but that does not mean that they can't create 
a Web site, using selected images and primary text documents to tell compelling stories. 

In preparation for the 125th anniversary of the Great Chicago Fire, the Chicago Historical Society was looking for a
way to commemorate the historic event. They ended up with a completely electronic exhibit which
patrons visited from their desktops (http://www.chicagohs.org/fire/intro/gcf-index.html). Carl Smith, an expert on
the Chicago Fire and Professor of English and American Culture at Northwestern University, was asked by the 
Historical Society to arrange an exhibit. He spent a few months searching archives at CHS to find the 300 images 
and 60 primary text documents which would be used, and then worked with a small technical team to create a 
framework for an extensive amount of historical information. He was equally interested in how the fire is 
remembered, and how that memory of the fire has been used by people over the years. The team took four months 
to set up a Web site that has been visited by a large number and wide range of people, from grade school kids to 
retirees. The site has been so successful that the historical society plans to leave it up indefinitely.

The only concession to real space that this cyber-exhibition made was to install six kiosks in the CHS Gallery, so
that gallery visitors could view the electronic site in that space. The Web site is divided into eleven chapters, each
containing an essay by Dr. Smith; a gallery of photographs, lithographs, and posters accompanied by detailed
captions; and a library of poems, songs, and newspaper articles. The galleries even include stereographic images that
were converted by Art Director Paul Hertz to red/blue (anaglyphic) stereographs to duplicate the 3D illusion.

Joe Germuska, the Learning Technologies Specialist who supervised production, described some of the design
concerns. "Carl Smith is a great writer and was adamant that we not make concessions to being on the Web. Some 
people say, 'Hey, it's a digital medium; you have to write differently,' but he was firm and insisted that it be written 
to the readership of the New York Times." 
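
The red/blue conversion of the stereographic images mentioned above can be sketched simply. A stereograph is a pair
of photographs taken from slightly different viewpoints. In a basic anaglyph, the left view goes into the red
channel of a single color image and the right view into the blue channel, so that red/blue glasses deliver one view
to each eye. The Python below is a minimal illustration under stated assumptions: grayscale source images held as
lists of rows, the common red-for-left convention, and a hypothetical function name. It is not a description of the
process Paul Hertz actually used.

```python
def anaglyph(left, right):
    """Combine a grayscale stereo pair into rows of (R, G, B) pixels.

    `left` and `right` are equal-sized lists of rows of ints 0-255.
    The left view fills the red channel and the right view the blue
    channel; green is left empty in this simple red/blue variant.
    """
    if len(left) != len(right) or any(
        len(lrow) != len(rrow) for lrow, rrow in zip(left, right)
    ):
        raise ValueError("stereo halves must have the same dimensions")
    return [
        [(lpix, 0, rpix) for lpix, rpix in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]


if __name__ == "__main__":
    left_view = [[10, 20], [30, 40]]     # tiny stand-ins for scanned halves
    right_view = [[50, 60], [70, 80]]
    for row in anaglyph(left_view, right_view):
        print(row)
```

In practice the two halves would first be cropped and aligned, a step this sketch leaves out.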

According to Germuska, a traditional exhibition could never have handled the rich blend of materials that are 
available on this site. "People can't stand at the wall of an exhibit reading a primary source document like
fire-survivor William Gallagher's 40-page letter, or the lead story from the Chicago Tribune's first edition after the fire.
We may have sacrificed some scale and texture by creating an electronic exhibition, but we gained by integrating 
the commentary about the images and the essays." They also created a virtual library, which is useful to both the 
general public and researchers. 

Libraries, museums, and image collections of all sorts are making their way online, providing a wealth of image 
resources. Organized to serve different communities and different purposes, they remain islands of information. 
While it will take some island-hopping to find the image for which you are searching, with some perseverance 
you're likely to find what you want. 

Robert Lucas is president of Encinitas Center in San Rafael, Calif.